Definition and properties of accrual failure detectors: an overview

Authors

  • Xavier Défago
  • Naohiro Hayashibara
  • Péter Urbán
  • Rami Yared
  • Takuya Katayama
Abstract

Ensuring fast and accurate failure detection is a fundamental issue in building efficient fault-tolerant distributed systems. In an effort to make fault-tolerant applications easier to implement, we are trying to provide failure detection as a generic Internet service, similar to what was done very successfully with NTP (Network Time Protocol) for clock synchronization. To do so, we must revisit the interaction model between the failure detection service and the applications it serves. We have defined a novel failure detector abstraction called accrual failure detectors. The main difference from traditional failure detectors is that, instead of outputting information of a binary nature (trust or suspect), accrual failure detectors output a level of confidence. Last year, we presented preliminary results on the implementation of a failure detector based on this principle [7]. In this presentation, we will show our recent advances on this issue. In particular, we will present the definition of accrual failure detectors. Such a definition is necessary for clearly establishing the relation with the already abundant work on failure detectors.
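To make the contrast with binary detectors concrete, the following is a minimal sketch and not the definition given by the authors. It assumes heartbeats arrive with timestamps, and it uses a deliberately simple suspicion measure (the silence since the last heartbeat divided by the mean heartbeat interval). The names `SimpleAccrualDetector`, `suspicion`, and `suspects` are hypothetical and chosen only for illustration. The point it shows is that the detector outputs a number, and each application turns that number into trust or suspect with its own threshold.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SimpleAccrualDetector:
    """Illustrative only: the suspicion level grows while no heartbeat arrives."""

    arrivals: List[float] = field(default_factory=list)

    def heartbeat(self, now: float) -> None:
        # Record the arrival time of a heartbeat from the monitored process.
        self.arrivals.append(now)

    def mean_interval(self) -> Optional[float]:
        # Mean time between consecutive heartbeats, or None if unknown.
        if len(self.arrivals) < 2:
            return None
        span = self.arrivals[-1] - self.arrivals[0]
        return span / (len(self.arrivals) - 1)

    def suspicion(self, now: float) -> float:
        # Assumed measure: silence so far, in units of the mean interval.
        mean = self.mean_interval()
        if mean is None or mean <= 0:
            return 0.0
        return max(0.0, (now - self.arrivals[-1]) / mean)


def suspects(detector: SimpleAccrualDetector, now: float, threshold: float) -> bool:
    # Each application picks its own threshold to get a binary answer.
    return detector.suspicion(now) > threshold


if __name__ == "__main__":
    detector = SimpleAccrualDetector()
    for t in (0.0, 1.0, 2.0, 3.0):
        detector.heartbeat(t)
    level = detector.suspicion(5.0)
    print("suspicion level:", level)
    print("aggressive application suspects:", suspects(detector, 5.0, 1.5))
    print("conservative application suspects:", suspects(detector, 5.0, 4.0))
```

With the same detector output, an application that prefers fast detection can use a low threshold while one that prefers few false suspicions can use a high one, which is the kind of decoupling a generic failure detection service needs.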


Similar resources

On Accrual Failure Detectors

Traditionally, failure detectors have considered a binary model whereby a given process can be either trusted or suspected. This paper defines a family of failure detectors, called accrual failure detectors, that revisits this interaction model. Accrual failure detectors associate with each process a real value representing a suspicion level. An important advantage of accrual failure detectors ov...


The Φ Accrual Failure Detector

Detecting failures is a fundamental issue for fault tolerance in distributed systems. Recently, many people have come to realize that failure detection ought to be provided as some form of generic service, similar to IP address lookup or time synchronization. However, this has not been successful so far. One of the reasons is the difficulty of satisfying several application requirements simultaneo...


A Weibull distribution accrual failure detector for cloud computing

Failure detectors are a fundamental component used to build high-availability distributed systems. To meet the requirements of a complicated large-scale distributed system, accrual failure detectors that can adapt to multiple applications have been studied extensively. However, several implementations of accrual failure detectors do not adapt well to the cloud service environment. To solve ...


Low-Overhead Accrual Failure Detector

Failure detectors are one of the fundamental components for building a distributed system with high availability. In order to maintain the efficiency and scalability of failure detection in a complicated large-scale distributed system, accrual failure detectors that can adapt to multiple applications have been studied extensively. In this paper, a new accrual failure detector, LA-FD, with low s...


LA-FD: a Low-overhead Accrual Failure Detector

The failure detector is one of the fundamental components for building a distributed system with high availability. In order to maintain the efficiency and scalability of failure detection in a complicated large-scale distributed system, accrual failure detectors that can adapt to multiple applications have been studied extensively. In this paper, an accrual failure detector, LA-FD, with low system o...




Publication date: 2005